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Abstract 

One of the greatest challenges in biomedical research, drug discovery and diagnostics is understanding how seemingly 
identical cells can respond differently to perturbagens including drugs for disease treatment. Although heterogeneity has 
become an accepted characteristic of a population of cells, in drug discovery it is not routinely evaluated or reported. The 
standard practice for cell-based, high content assays has been to assume a normal distribution and to report a well-to-well 
average value with a standard deviation. To address this important issue we sought to define a method that could be 
readily implemented to identify, quantify and characterize heterogeneity in cellular and small organism assays to guide 
decisions during drug discovery and experimental cell/tissue profiling. Our study revealed that heterogeneity can be 
effectively identified and quantified with three indices that indicate diversity, non-normality and percent outliers. The 
indices were evaluated using the induction and inhibition of STAT3 activation in five cell lines where the systems response 
including sample preparation and instrument performance were well characterized and controlled. These heterogeneity 
indices provide a standardized method that can easily be integrated into small and large scale screening or profiling 
projects to guide interpretation of the biology, as well as the development of therapeutics and diagnostics. Understanding 
the heterogeneity in the response to perturbagens will become a critical factor in designing strategies for the development 
of therapeutics including targeted polypharmacology. 
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Introduction 

One of the greatest challenges in drug discovery and 
development is understanding how seemingly identical cells 
respond differendy to drug treatment [1]. In cancer, the 
prevalence of intra-tumor genetic and phenotypic heterogeneity, 
results from clonal evolution [2,3], epigenetic plasticity[4] and 
variation in tumor microenvironments [5] and suggest that a single 
drug targeting a single driver is not lilcely to adequately control 
disease progression [6]. The complexity of the tumor microenvi- 
ronment, which extends to stromal cells, including immune cells, 
may contribute significantly to the development of resistance to 
treatment [5]. Efforts to recapitulate the in vivo tumor microen- 
vironment in physiologically relevant models will require an- 
alytical approaches that address the heterogeneity in the model 
[7,8] . However, cellular heterogeneity is not limited to cancer cells. 



but is exhibited even in normal, clonal cell lines, and the impact of 
heterogeneity extends from basic biology to drug discovery and 
diagnostics [9-1 1]. 

It is now understood that there are multiple sources of 
heterogeneity in cell populations including both genetic and 
non-genetic factors. Genetic variation is well studied [4,12,13]. 
Non-genetic heterogeneity, also referred to as phenotypic hetero- 
geneity, is variability of one or more phenotypes or traits within a 
clonal population [9]. Non-genetic heterogeneity has been 
organized into a hierarchy of dichotomies starting with extrinsic 
versus intrinsic factors [9]. Variation in extrinsic factors results 
from variation in the cellular microenvironment. Intrinsic 
heterogeneity arises from intracellular factors, even in a uniform 
environment, and can be further subdivided into macro- and 
micro-heterogeneity [9] . The former refers to the variability in one 
or more cell traits that result in discrete phenotypes and the latter 
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to the apparently continuous random variation within a single 
phenotype. It is widely accepted that non-genetic heterogeneity 
plays an important biological role in cell behaviors such as cell fate 
decision in stem cells, development and cellular physiology [9-1 1]. 
It is also of increasing interest in tumor diagnostics, therapeutics 
and disease management, as well as drug discovery and 
development [14—17]. 

A major opportunity in drug discovery is to apply a quantitative 
systems pharmacology (QSP) approach to modulating the 
biochemical networks that are involved in disease, in contrast to 
identifying and validating a single molecular target up front [18- 
20]. High Content Analysis (HCA) [21,22], flow cytometry [23], 
single cell genomics [24] and other "phenotypic" methods provide 
the capability to measure multiple biomarkers in large numbers of 
individual cells. In particular, HCA can be used to profile 
individual cells within tissues and small animal models, as well as 
in 2D and 3D arrays of cells [15,25]. However, it has been 
standard practice in HCA to reduce the detailed cellular data to a 
population average (well average) that is intended to characterize 
the overall response of the cells, assuming a normal distribution 
[26]. 

The plate-to-plate and the day-to-day variabilities of HCA 
measurements are usually characterized by the Z' factor or the 
stricdy standardized mean diflerence (SSMD) [26-29]. These 
metrics assume a normal distribution of the well average data [30] . 
However, there has not been a similar effort in HCA to address 
phenotypic heterogeneity in a simple, standard and quantitative 
manner amenable to medium to high throughput screening. There 
have been multiple studies in which cellular heterogeneity was 
evaluated and characterized. For example, classifiers were trained 
to identify subpopulations based on collections of phenotypic 
features. In some cases the subpopulations were characterized by 
the median and interquartile range [31]. In addition, an analysis 
based on visual analytics combining parallel-coordinate plots, used 
for a visual assessment of the high-dimensional dependencies, and 
nonlinear support vector machines, for the quantification of 
heterogeneity, has also been demonstrated [32]. A heterogeneity 
scoring approach (HetMap) was designed to visualize the 
heterogeneity within an individual patient's breast tissue based 
on immunohist(){ h("mistry in the context of a patient population 
[33]. Furthermore, analytical tools such as Kolmogorov-Smirnov 
(KS) statistics, machine learning, and univariate and multivariate 
analyses have been applied to analyze perturbations in cells with 
drugs and siRNA [34-43]. These analytical tools have been 
valuable for characterizing heterogeneity and demonstrating the 
value of heterogeneity analysis in drug discovery, pathway analysis 
and diagnostics, but are not optimal for routine evaluation of 
large-scale screens or profiles. 

The goal of the present paper is to describe a method for the 
analysis of cellular heterogeneity in cellular phenotypes that 
includes: developing a set of "indices" to identify, quantify and 
characterize heterogeneity in a way that it can he easily included 
in all screening and cellular profiling; as well as to demonstrate an 
optimal data representation to visualize the fuU range of 
heterogeneity in the data when it is identified. We use 
heterogeneity in the activation of STAT3 as a model system for 
developing and testing indices and show how the heterogeneity 
indices can be used in high throughput biology and drug discovery 
to quantify, compare and flag studies in which: 1) there is a high 
degree of variability in the cellular responses, 2) results suggest 
there is more than one subpopulation, or 3) there are more than 
the expected number of outiiers. This important information wiU 
also be essential to interpreting ceUular responses in multiplexed, 
2D and 3D screens, as weU as within more complex microenvi- 



ronments in vivo and in vitro, in physiologically relevant disease 
and organ models, as well as patient samples. 

Materials and Methods 

Cell culture 

Cal33 human head and neck squamous cell carcinoma 
(HNSCC) cells [44,45] were kindly provided by Dr. Gerard 
Milano (University of Nice, Nice, France). The cell line was 
maintained in Dulbecco's modified Eagle's medium (Life Tech- 
nologies) supplemented with 1 0 % fetal bovine serum (Gemini Bio- 
Products), 100 U/ml penicillin and 100 M-g/^nl streptomycin 
(HyClone). MCF-7 and MDA-MB-468 human breast carcinoma 
ceUs [ATCC cell lines obtained from Dr. Adrian Lee, University of 
Pittsburgh] were cultured in DMEM Glutamax media (Life 
Technologies) supplemented with 10% FBS (Gemini Bio-Prod- 
ucts), 100 U/ml penicillin and 100 |ig/ml streptomycin (Hy- 
Clone). MCF-lOA human breast cells [ATCC cell line obtained 
from Dr. Adrian Lee] were cultured in DMEM F12 media (Life 
Technologies) supplemented with 5% Horse Serum (Life Tech- 
nologies), 10 (tg/ml Insulin (Sigma- Aldrich), 20 ng/ml Epidermal 
Growth Factor (Sigma- Aldrich), 20 ng/ ml Cholera Toxin (Sigma- 
Aldrich), 500 ng/ml Hydrocortisone (Sigma- Aldrich), and 1% 
Penicillin/Streptomycin (Life Technologies). All ceU lines were 
maintained in humidified incubators at 37°C with 5% COj. 

Stimulation and inhibition of STATS phosphorylation 

Cal33 cells were plated in coUagen-coated 384-weU plates 
(Greiner Bio-One) at 2000 cells/well to reach 50% confluence on 
the day of fixation. The cells were incubated at 37°C for 24 hours 
followed by serum deprivation for another 24 hours. For 
stimulation of STATS phosphorylation, human recombinant 
interleukin-6 (IL-6) and Oncostatin M (OSM) (R&D Systems) 
were added in 2-fold or 2.4-fold serial dilution for 10 final 
concentrations descending from 200 ng/ml or 50 ng/ml, respec- 
tively. For the time course of cellular response to stimulation, the 
cells were incubated with cytokines at 37°C for 15, 30, 45, 60 and 
120 minutes before fixation. For inhibition of STATS phosphor- 
ylation, Pyridone-6 (Calbiochem) was added in 10-point 3-fold 
serial dilution for final concentrations descending from 5 [iM. 
Stattic (Sigma- Aldrich) was added in 10-point 3-fold serial dilution 
for final concentrations descending from 50 [xM. After 3 hours 
incubation with the inhibitors at 37°C, cells were stimulated with 
50 ng/ml of IL-6 for 15 minutes (peak induction time) before 
fixation. Each treatment was performed in triplicate. Each 
experiment was repeated at least 3 times. The assays were 
optimized for cell density, cytokine dose and treatment time with 
their robustness validated using the Z' factor [27]. Each plate 
included 16 positive and 16 negative control wells from which the 
Z' was calculated (ranged from 0.5 to 0.8). 

Immunofluorescence labeling 

Cells were treated with cytokines with or without incubation 
with inhibitors, then fixed in plates for 30 minutes at room 
temperature with 3.7% Formaldehyde in PBS, 2 |J,g/ml Hoechst 
3 3 342 (Life Technologies) for nuclear staining, and then 
permeabilized on ice with 95% MeOH in PBS for 30 minutes. 
PermeabHized cells were washed and incubated with 0.1% Tween- 
20 in PBS at 4°C overnight. Cells were labeled for 1 hour with a 
1:800 dilution of mouse anti-pY705 STAT3 antibody (BD 
Biosciences) followed by a 1 hour incubation with a 1:300 dilution 
of the Alexa Fluor 647-donkey anti-mouse AffiniPure secondary 
antibody (Jackson ImmunoResearch) prior to high content 
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imaging. The fixation procedure and antibody titrations were 
optimized individually. 

High Content Analysis 

Labeled cells were imaged with an ArrayScan VTI (Thermo- 
Fisher -CeUomics) using a lOX (0.45NA) objective, a stable LED 
illumination source with excitations of 386/23 nm and 650/ 
1 3 nm, and a multiband emission filter with transmission at 440/ 
40 nm and 700/60 nm for Hoechst (Chi) and AlexaFluor 647 
(Ch3), respectively. Image correction for non-uniformity of the 
field of the VTI was accomplished using an Opera Adjustment 
Plate (PerkinElmer) which contains uniform dye solutions as 
targets for reference image collection. Images were corrected using 
the VTI acquisition software and analyzed using the Compart- 
mental Analysis Bioapplication (Thermo Fisher- CeUomics). 
Briefly, the Hoechst nuclear images were segmented using an 
isodata threshold. The DNA content was measured as the 
Bioapplication feature ObjectTotallntenChl for the selected 
nuclear region ('circ'). The 'circ' regions were copied to channel 
3 where it was used to construct a 'ring' starting 1 pixel out from 
the 'circ' and 3 pixels wide. The STAT3 phosphorylation was 
measured using the Bioapplication feature CircRingAvglnten- 
DiflChS. Images of four fields (t)pically ranges from 2000 to 2500 
cells total) were acquired for each well. The pixel average nucleus- 
cytoplasm difference (CircRingAvglntenDiflChS) in pSTATS 
fluorescence intensity (referred as "Relative Activity" in the plots) 
was calctjlated for each cell in all four fields. Cell level data were 
retrieved from the CeUomics Store Database (ThermoFisher) into 
Spotfire (TIBCO) using SQL queries and plotted for dose and 
time dependent responses. Experimental metadata were merged 
into Spotfire using an Excel (Microsoft) template and the 
combined data was analyzed to generate a 'histo-box plot' as 
described below. 

Flow Cytometry 

Cell and Bead Standards: Flow cytometry standards were used 
to establish the resolution and linearity of the high content imaging 
relative to that of a flow cytometer. In this study we used Becton 
Dickinson 2 nm beads (DNA QC Particles 349523, Becton 
Dickinson). 

Flow Cytometry: The LSR II (Becton Dickinson) was config- 
ured witii UV-355 nm, Violet-404 nm, Blue-488 nm and red-633 
excitations. The pulse emission areas were collected with the 
following filters: Hoechst 33342^50/50 nm. Fluorescein - 530/ 
30 nm, Cy3 - 610/20 nm, Cy5 - 660/20 nm and Cy7 - 780/ 
60 nm. To establish that high content imaging was capable of 
acquiring accurate DNA histograms, Cal33 cells were run on both 
the LSR II flow cytometer and the ArrayScan VTI High Content 
imager (ThermoFisher). The cells were trypsinized, fixed, stained 
with Hoechst 33342 (Life Technologies) at 2 ng/ml. The flow 
cytometry sample was run as a suspension of 2x10^ cells/ml at a 
rate of ~ 100 cells/s(;('. The data were then analyzed \vith Flowjo 
7.6.5 (TreeStar) or exported as FCS 3 format for cell c:ycle analysis 
in ModFit (Verity Software House). For high content imaging, the 
Cal33 cell suspension was spun down on a 96 well microplate. The 
data were exported to text files that were converted to FCS format 
with Text to FCS Software 1.2.1 and then analyzed identically to 
the flow cytometry data. 

Visualization of subpopulation distributions 

A modification of a standard 'box plot' was generated in 
Spotfire to visualize the distributions of cellular responses and 
identify heterogeneity in those cell populations. The 'box' in a box 
plot represents the extent of the central 50% of the population but 



gives no indication of the distribution of those values. Spotfire 
includes an option in the 'Appearance' settings for the box plot to 
overlay the distribution (histogram) which we used to create the 
visualization we refer to as a 'histo-box plot'. Similar plots, such as 
violin plots [46] and bean plots [47], can be created in R [48], 
Madab (Math Works) and other applications. The histo-box plot 
was used to analyze experimental data from the High Content 
Screening assays. Parameters characterizing the distribution such 
as the interquartile range (IQR), lower and upper inner fences and 
percent outiiers were calculated. Log scaling of histo-box plots was 
done by applying the 'LoglO' function in Spotfire- to the a\crage 
CircRingAvgIntenDifiCh3 values and is referred as 'Log (Relative 
Activity)' in the plots. 

Data analysis 

Statistical measures used for heterogeneity analysis: 

Coefficient of variation (CV) 7„ (1) 



Interquartile Range (IQR) gs - Qi (2) 



N 

Quadratic Entropy (QE) QE= '^y^Pi^Pj (3) 

,>7 = 1 



Kohnogorov— Smirnov distance (KS) 

KS = max | CDF^a, - CDF,,f\ ^"^^ 

where a is the standard deviation (SD), |j, is the mean, Qj, is the n- 
th quartUe, A'^ is the number of data values, is a linear matrix of 
intensity differences between data points i and j, p is the 
probabilit)' distribution of data points, and CDF,i,^t and CDF,,,f 
are the cumulative distribution functions of the data and a 
reference distribution, respectively. Quadratic entropy was calcu- 
lated as a summation over 64 equally spaced bins spanning the 
range from -34 to 1441 to minimize the finite size effects associated 
with the binning scheme. Scalar statistic KS values were computed 
for each sample distribution using the MATLAB (MathWorks) 
function 'kstestO' and a reference normal distribution generated as 
a Gaussian distribution: 

,(x|.,.)^-^exp{-i^^}. (5) 

with the mean fi and standard deviation a of the measured sample 
distribution and g is the probability density about x. 

Results 

Characterization of heterogeneity in HCS 

We optimized the assay and performed an HCS screen to 
measure the activation of the STATS signaling pathway in 
response to interleukin 6 (IL-6) and/ or Oncostatin M (OSM) using 
an antibody against phospho-STAT3-Y705 [49]. The assay was 
optimized and validated for timing and duration of activation, as 
well as robustness in several cancer and normal cell lines (Figure 
SI). The induction of STATS by IL-6 in Cal3S cells exhibited a 
high level of cell-to-cell variation even at the optimal exposure 
time of 15 minutes and a dose that produced maximal activation 
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(&50 ng/ml IL-6). The variability in tlie fluorescence intensity of 
tlie nuclear localized, activated (phosphorylated) STAT3 was 
easily observed in the images (Figure lA). Despite this cell to cell 
heterogeneity within each well, the assay was highly reproducible 
by standard criteria (a Z'aO.5, and a signal-to-backgTound >5) 
and exhibited a typical dose-response (Figure IB). However, the 
reproducibility indicated in Figure 1 B is a measure of the well-to- 
weU reproducibility and does not give any indication of the cell-to- 
cell variability demonstrated by the large error bars (± 1 SD) 
within a well (Figure IC). Application of standard assay 
performance criteria like to the characterization of cell-to-cell 
reproducibility would result in a highly negative Z', indicating high 
cell-to-cell variability, as we observed in Figure lA and IC, but no 
insight into the nature of the cellular heterogeneity. Clearly a 
different approach to characterizing cellular heterogeneity is 
needed as a complement to determining the Z' and S/B for an 
assay. 

To determine whether the high degree of variation in the level 
of STAT3 activation was unique to the Cal33 cell line, and/or 
activation by IL-6, we validated the assay on a panel of 5 cell lines 
and then compared the activation of STAT3 by 2 different 
cytokines, IL-6 and OSM. Figure 2 shows example distributions of 
STAT3 activation by 5 doses of IL-6 or OSM in the 5 cell lines 
(data provided as DataArchive SI). To visualize the distribution 
and statistical parameters of the cellular responses we used a 
standard box plot with an overlaid histogram that we refer to as a 
'histo-box plot' (see Figure S2 for more details). The interquartile 
range (IQR, the 'box') extends from the T' quartUe to the 3"' 
quartUe and comprises 50% of the data. The histogram extends 
from the lower inner fence (LIF) to the upper inner fence (UIF). 
Outiiers, points outside the range from LIF to UIF (indicated as 
individual points on the plot), as well as reference indicators of the 
average (white line) and the 1 0* and 90''^ percentiles (black dashed 
lines) are presented in the plot. Note that although a standard box 
plot indicates the median of the distribution, here for reference we 
show the more commonly used assay parameter, the average 
value. Clearly the distributions of the activation of STATS vary 
widely between these cell types, and cytokines. 

As illustrated in Figure 2, Cal33 cells exhibit a bimodal 
distribution in response to IL-6 and a more "normal" distribution 



(though with more outliers) in response to OSM, demonstrating 
that different signaling ligands can induce distinct patterns of 
heterogeneity in the same cell type. In contrast, 686LN and MCF- 
lOA cells exhibit a similar distribution of responses to both IL-6 
and OSM. The 686LN cells were ~4-fold more sensitive to IL-6 
and «2-fold less sensitive to OSM than either Cal33 or MCF-lOA 
cells. The breast cancer cell lines, MCF-7 and MDA-MB-468, 
showed little or no response to IL-6, with an apparent dose- 
dependent increase in outiiers, indicating that a small subset of the 
cells respond to IL-6 to the same extent as Cal33 or 686LN cells 
and with the same sensitivity as Cal33 cells. The breast cancer cells 
exhibit very different responses to OSM. The MDA-MB-468 cells 
response to OSM is similar to the MCF-lOA cells, while the 
majority of MCF-7 cells respond over a range from seemingly non- 
responsive to low level responders with outliers exhibiting a high 
level response comparable to MCF-lOA and MDA-MB-468. In 
summary, we observed differences in distributions between cell 
lines and between cytokines that varied from narrow to broad, 
bimodal vs. normal and variation in the number of outliers. The 
presence of macro-heterogeneity, micro-heterogeneity and a 
variable % of outliers were evident in these data. 

To ensure that the observed heterogeneity was not due to 
instrumental or measurement variability (instrument systems 
response), we compared HCA measurements of standard fluores- 
cent beads and DNA content in Cal33 cells with measurements of 
the same samples by flow cytometry (Table SI, Figure S3). In both 
cases the imaging CV was 2-3% higher than the flow cytometry 
CV, as expected, but stiU only about 5% for beads and about 8% 
for DNA content, well below the CV of ~ 50% in Figure IC. 
Therefore, instrumental systems response is not an explanation for 
the heterogeneity. 

In this study, simple biological explanations for the heteroge- 
neity observed in Figure 2 could include a dependence on cell 
cycle and/or expression level of the IL-6 receptor. We investigated 
the potential cell cycle dependence of the activation of STAT3 and 
found there was no correlation between cell cycle phase and 
STAT3 activation by IL-6 (Figure S4). Western blot analysis of the 
expression levels of IL-6 in the cell lines indicates that 
responsiveness is not directiy correlated with total receptor 
expression (Figure S5). Determination of the molecular basis of 




Figure 1 . Heterogeneity in the activation STATS in Cal33 cells. Cal33 cells were treated with IL-6 (50 ng/ml) for 15 min. then fixed and labeled 
with an antibody to phospho-STAT3-Y705. A) Pseudocolor image of STAT3 activation shows a high degree of heterogeneity in the intensity of the 
Cy5-labeled secondary antibody (color scale at lower right indicates mapping of relative fluorescent intensities to colors). Scale bar is 100 um (lower 
left). B) The standard deviation of the well average STAT3 activity in replicate wells (EC50 = 3.3 ng/ml, error bars are ±la, N = 8) indicates the assay is 
highly reproducible despite the observed cellular heterogeneity (Z' = 0.54) C) The standard deviation of the cellular STAT3 activity (error bars are ±1 a) 
indicates the high variability in the cell-to-cell STATS Activity consistent with the appearance of the image (A). 
doi:1 0.1 371/journal.pone.01 02678.g001 
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Figure 2. Variation in the cellular distributions of STAT3 activation by IL-6 and OSM in several cell types. Top series) l-iisto-box plots of 
the activation of STAT3 by IL-6 after 15 min exposure to IL-6 at the indicated concentrations in 2 HNSCC cell lines, 1 breast cell line and 2 breast 
cancer cell lines. Bottom series) The activation of STATS by OSM was measured at 15 min. in the same 5 cell lines as above. Note: 686LN cells were 
found to be much more sensitive to IL-6 and much less sensitive to OSM than the other cell lines, so the concentrations were adjusted appropriately. 
doi:1 0.1 371/journal.pone.01 02678.g002 



the heterogeneity is an important cliallenge and is being pursued 
with a range of experimental approaches, including live cell, 
kinetic studies of STAT3 activation, but is beyond the scope of this 
investigation. 

Because the Cal33 cells exhibited a bimodal distribution in 
response to IL-6 and a more normal distribution in response to 
OSM, we decided to examine those dose-responses in more detail 
(Figure 3, DataArchive SI). STAT3 activation by IL-6 in Cal33 
cells exhibited a bimodal distribution, with a a 1 0 % apparentiy 
non-responding subpopulation at all concentrations tested 
(Figure 3 A and 3B). For comparison of distributions with different 
means, or potentially log-normal distributions [50], we also used a 
log-scaled histo-box plot (see Figure S2). Linear scale plots 
(Figure 3A & 3C) allow visualization of the intensity range, 
separation and size of the subpopulations. When plotted on a log 
scale (Figure 3B & 3D), the apparent width of a distribution is 
proportional to the linear CV, independent of the mean intensity, 
allowing direct comparison of the subpopulation CVs (micro- 
heterogeneity). The linear scaled histo-box plot exhibits a narrow 
distribution of cells at the unstimulated level of STAT3 intensity 
(highlighted in blue), with only a few oudiers of activated cells (< 
2%). Upon activation with IL-6, there remains a distinct and 
persistent subpopulation (—10% at the saturation level of 
stimulation- 1 00 ng/ ml) of apparently non-responding cells (high- 
hghted in blue in Figure 3) and a heterogeneous population of 



responding cells. The log scaling (Figure 3B) showed that the CV 
of the distribution of the unstimulated cell population is about the 
same as that of the stimulated population (58% and 50% 
respectively). The presence of the 2 sub-populations, responding 
and apparently non-responding is an example of macro-hetero- 
geneity. Macro-heterogeneity is distinguished from outliers in that 
oudiers are singular observations that are well separated from the 
median, above the UIF (median -Hl.5*IPR) or below the LIF 
(median - 1.5*IPR), and of insufficient density to be detected 
above the background in a histogram. The limits of detecting a 
subpopulation will depend on the number and distribution of cells 
and may include subpopulation distributions with significant 
overlap. Multiplexed assays, looking at multiple parameters wiU 
assist in further defining the sub-populations [31]. 

Cal33 cells exhibited a different distribution of STAT3 
activation in response to OSM. At the OSM saturation 
concentration, STAT3 activation was observed in essentially all 
the cells (Figure 3). The bimodal distribution was only apparent at 
intermediate OSM concentrations (3.62 ng/ml and 8.68 ng/ml) 
suggesting two subpopulations of cells, one activated at a lower 
dose of OSM, while the other responding at a higher dose, in 
contrast to the persistence of the IL-6 resistant subpopulation even 
at high doses of IL-6 (Figure 3C and D). Overall, OSM stimulation 
of Cal33 cells resulted in a higher degree of activation of STAT3 
than stimulation with IL-6. To evaluate whether the cells that do 
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Figure 3. Visual analysis of phenotypic heterogeneity using the histo-box plot. Population distributions of STAT3 activity in Cal33 cells at 
the peak induction time of 15 min. Relative Activity refers to the cell average nuclear intensity of the labeled pSTATB. Cells with intensity below the 
90* percentile of the untreated cells (left most histogram) are colored in blue to highlight the apparently non-responsive subpopulations. "Count" 
indicates the total number of cells measured. A) Linear-scaled dose-response distributions of STAT3 activity at the indicated concentrations of IL-6 
show a persistent subpopulation of cells with a distribution comparable to the unstimulated cells. The well average EC50 = 3.3 ng/ml. B) Log-scaling 
of the same distributions in A shows that the CV of the responding cells (far right) is similar to the unstimulated cells (far left). C) Linear-scaled 
population distributions of STAT3 activation by OSM at the indicated concentrations also show a non-responding subpopulation at 8.68 ng/ml, but 
unlike IL-6 there are only a few outliers that are apparently non-responsive at 50 ng/ml and the responding cells appear to be more normally 
distributed. D) Log-scaling of the same distributions in C shows that the CV of the responding cells (far right) is similar to the unstimulated cells (far 
left). 

doi:l 0.1 371 /journal.pone.01 02678.g003 



not appear to respond to IL-6 stimulation were in fact capable of 
phosphorylating STATS, we induced cells with combinations of 
IL-6 and OSM (Figure S6). Even at the lowest concentration of 
OSM tested (2.78 ng/ml) cells co-induced with any concentration 
of IL-6 did not exhibit a non-responding subpopulation. Further- 
more, at this low concentration of OSM (2.78 ng/ml), OSM alone 
did not induce a STATS response, suggesting there may be some 
cooperativity between IL-6 and OSM activation of STATS. This 
will be investigated in more detail in a future study. 

Although the analysis of hett-rogeneity described above using 
the histo-box plot provides insights into functional responses, such 
interactive analysis is limited to relatively small numbers of 
experiments where throughput is not critical. For large scale 
profiling and particularly in compound screening, a method to 
identify and quantify heterogeneity is needed in order to compare 
large numbers of compounds, targets, or assays and to make 
decisions about next steps, efficiently. 

Selection and evaluation of the heterogeneity indices 
(HI'S) 

Based on the results described above we selected properties for 
characterization of the distributions that corresponded to the 
features that were identified as variable in the histo-box plots and 
descriptors that can be interpreted in a biologically meaningful 
way. Figure 4 defines the three selected descriptors of the 
distributions and the indices selected to quantify those features, 
cell diversity (DIV), non-normality (nNRM) and percent outliers 
(%OL). We compared the performance of several metrics for 
calculating DIV and nNRM, using model distributions and cell 
data, and found Q_E and KS gave the most consistent and robust 
results (Figure 87). The indices can be used for relative comparison 
or threshold values can be established for classification of samples. 
In this assay the CalSS negative control wells (no IL-6) showed 
only a small percentage of cells with activated STATS while the 
majority were narrowly distributed (Figure S). We used these 
'homogeneous' negative control wells to establish a threshold value 
for DIV and nNRM, equal to the mean -l-3*SD of the well-to-well 
values of the index. For %OL the threshold was selected based on 
a normal distribution. The normal distribution has a UIF-LIF 
range of 4 SD (mean ± 2 SD) that contains 95.5% of the 
population and therefore the expected %OL is 4.5%. Therefore 
the threshold of >4.5% indicates more oudiers than would be 
expected if the distribution were normal. Figure 5 shows the 
application of the selected threshold values for classification of 
wells as heterogeneous. To examine the suitability of the candidate 
heterogeneity indices and thresholds we used the data sets for IL-6 
and OSM induction in the five cell lines (Figure 2). The results are 
presented as bar graphs of the three parameters (Figure 6). When 
stimulated with IL-6 (left panel), CalSS cells exhibit a gradual 
increase in DIV, a consistendy high nNRM and a decrease in 
%OL. On the other hand, OSM (right panel) has littie effect on 
the Hi's below 8.6 ng/ml but induces a nearly 2-fold greater 
increase in DIV, with essentially no change in nNRM or %OL, 



consistent with the distributions in Figure S. The other cell lines 
exhibit different patterns of response to IL-6 and OSM. 686LN 
and MCF-lOA cells respond essentially the same to IL-6 and 
OSM. MCF-7 and MDA-MB-468 cells respond to IL-6 with an 
increase in nNRM and %OL, but no increase in DIV, while OSM 
induces a significant increase in DIV, with a small increase in 
nNRM in the MCF-7 cells. It is interesting that the pattern of 
heterogeneity induced by OSM in MCF-7 cells is very similar to 
that induced by IL-6 in CalSS cells. In nearly all cases OSM 
induces a more normally distributed response, which is stiU 
heterogeneous, while the response to IL-6 is much more variable. 

The interpretation of the three Hi's is accomplished by applying 
a binary decision tree (Figure 5) that classifies a population 
distribution as "homogeneous", "homogeneous with outiiers", 
micro-heterogeneity, micro-heterogeneity with outliers, macro- 
heterogeneity, or macro-heterogeneity with outliers. We use 
"homogeneous" as a relative term, since cell populations always 
exhibit some heterogeneity. 

We evaluated the effect of two known inhibitors of STATS 
activation on the IL-6 stimulated distributions in CalSS cells. 
Pyridone-6 is a pan-Janus-activated-kinase (Jak) inhibitor [51] and 
Stattic is reported to interact with the SH2 domain of STATS, 
inhibiting dimc'rization and nuclear translocation [52]. Both 
compounds inhibited IL-6 induced STATS activation with 
IC50s of 0.066 |iM and 10 nM, respectively (Figure 7 and Figure 
88, DataArchive 82). Figure 7A and 7C display log-scaled histo- 
box plots of pyridone-6 and Stattic inhibition of CalSS cells, 
respectively. The IC50s are shown as dashed lines. Pyridone-6 
treated samples appear to have an increasing fraction of inhibited 
cells starting at the lowest concentration and a stable population of 
STATS activated i:ells up to about 0.1 |J,M, resulting in a 
broadening of the distribution. These trends are reflected in the 
Hi's (Figure 7B). For pyridone-6 the DIV index is above threshold 
up to the IC50, but the increase in the nNRM index indicates the 
presence of differentially responding sub-populations of cells, 
macro-heterogeneity with outiiers. Above 1 [iM the cclh are 
essentially aU inhibited, except for a few oudiers, some of which 
appear to be STATS activated cells. 

Stattic, in contrast, displayed a stable population distribution 
with no evidence of inhibition until the concentration reaches the 
IC50 at which point there is a very steep inhibitory effect. Stattic 
showed essentially no change in Hi's up to the IC50 indicating 
that nearly all the cells have essentially the same sensitivity to this 
compound, and therefore respond at the same dose level 
(Figure 7D). Thus, while pyridone-6 has a more potent IC50, 
Stattic has a more uniform effect as a modulator of STATS 
activation. In both cases the compounds show an increase in % OL 
above the IC50, exceeding the selected threshold of 4.5%, and 
persisting even at the highest dose tested. This indicates that 
neither compound is entirely effective at inhibiting the activation 
of STATS. 

An important consideration in the application of the Hi's is the 
sample size requirements. To address this we performed power 
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Characteristic 



Cell Diversity (Div) 
Increase in random 
variation in the cellular 
response 



Example 




Index 



Quadratic Entropy (QE) 
"Information Content" with 
"Difference matrix" 



non-Normality (nNrm) 
Subpopulations of 
cells responding 
differently 



%Outliers(%OL)-A 
few cells that respond 
very differently 




Kolmoqorov-Smirnov (KS) 
Difference from reference 
Normal distribution 



Percent Outliers (%0L) 
Cells outside the range of 
LIF-UIF (~>2ofrom mean) 



Figure 4. Three indices for characterizing cellular heterogeneity. Three indices that provide information about the distribution were chosen. 
Cell Diversity (DIV) characterizes the overall heterogeneity in the population without regard for the specific shape of the distribution, using the 
Quadratic Entropy, a metric that is sensitive to the spread of the distribution as well as the magnitude of the differences between phenotypes in the 
distribution. Non-Normality (nNRM) indicates deviation from a normal distribution, distinguishing between micro- and macro-heterogeneity. 
%Outliers (%0L) indicates the fraction of cells that respond differently than the majority. 
doi:1 0.1 371 /journal.pone.01 02678.g004 



analysis for the HFs (Table S2). For the DIV and nNRM indices in 
this assay, to achieve a power level of 0.8 requires —900 and 
~ 1 1 00 cells respectively. This number of cells is easily achievable 
in standard assay formats such as 384 well microplates. In this 
assay, which was implemented in the 384 well format, 1000 cells/ 
well represents about 4 fields/well at lOx. 

Discussion 

The significance of heterogeneity in the development of 
therapeutics and diagnostics 

Heterogeneity is a characteristic of cellular populations that is 
fundamental to biological processes including development, 
dilferentiation, and immune-mediated responses [9]. Analysis of 
heterogeneity is expected to be useful in a wide range of biological 
applications including the differentiation of stem cells and the 
development of assays in difiFerentiated neuronal cells, where we 
would expect to find significant heterogeneity. Certainly, in the 
context of this cancer example, heterogeneity in the response to a 
potential therapy is "bad", however, in other applications 
heterogeneity analysis may be essential to characterizing the 
response of a subpopulation of interest, or even as a primary 
readout for screening. When heterogeneity is associated with 
dysregulated genetic -based and/or non-genetic-based functions, it 
can play a critical role in the progression of complex diseases such 
as cancer [53], where intra-tumor heterogeneity poses a formida- 



ble challenge to the development of therapeutics [15,54] as well as 
diagnostics [15,25,33]. Thus identifying, quantifying and charac- 
terizing heterogeneity in patient samples and disease relevant 
models for drug discovery using validated cell-by-cell analysis 
methods [15,33,53,55] represents an important unmet need. To 
address this need we have defined and developed heterogeneity 
indices (HFs) (Figure 4) that enable the full potential of HCA and 
other cell-by-cell analytical methods. As a specific example, we 
applied these indices to identify, quantify, and characterize 
intrinsic heterogeneity in the activation of STAT3 in response to 
two cytokines and small molecule perturbagens. Based upon these 
results, we recommend a new paradigm for the application of 
these or similar HFs to the discovery of small molecule probes and 
therapeutics. Heterogeneity in the response to such probes may 
have important implications for understanding fundamental 
mechanisms of biological regulation and, as a mainstay in 
personalized medicine, lead to the development of novel 
therapeutic strategies for complex diseases (see below). 

High Content Screening (HCS) was developed as a tool to 
automatically acquire, process, store, analyze and view large 
amounts of cellular data, creating an efficient platform for cell-by- 
ceU analysis [22,56-59]. However, the traditional focus in drug 
discovery on high throughput screening encouraged most 
researchers to focus on well average assays as a standard. This 
approach increased the throughput of screening but sacrificed the 
information on heterogeneity in the population [30]. As a result, 
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Figure 5. Decision tree for interpreting tlie Heterogeneity Indices. Using thresholds established for each index, DIV, nNRM and %0L, a binary 
decision tree can be used to characterize heterogeneity in a given sample. The thresholds for DIV (0.03) and nNRM (0.05) were selected as the mean + 
3 SD for each index in replicate negative control wells for Cal33 cells. The threshold for %0L (4.5%) is the percent outliers expected for a normal 
distribution. 

doi:1 0.1 371 /journal.pone.01 02678.g005 



although heterogeneity is widely recognized as a fundamental 
characteristic of biological systems, relatively little is known about 
the nature of heterogeneity in the cellular or tissue response to 
current pharmaceuticals. 

Although a well average assay may exliibit a very good Z', and 
therefore a high degree of reproducibility [27,28], the ceU-to-ceU 
heterogeneity within a well can be significant (Fig 1, 2, 3, 6, 7). In 
developing new drugs it is not sufficient to modulate the "average" 
cell if heterogeneity exists, particularly for cancer therapeutics. 
Thus we aimed to identify a simple set of metrics, the Hi's, that 
could be automatically calculated and reported along with the 
standard well-level read-outs of mean and SD, and the well-to- 
well, plate-to-plate and day-to-day metrics of Z', S/B, and CV, to 
rapidly determine if heterogeneity exists and to quantify the extent 
of the heterogeneity (Figures 4, 5, 6, 7). 

As HCA is utilized more extensively to quantitate cellular 
heterogeneity, there must be a focus on the development of quality 
control standards and practices such as those that have been 
successfully implemented in flow cytometry. To distinguish the 
"system" variability (includes sample preparation, instrument 
response and algorithm performance) from the variability in 
biological responses (i.e. intrinsic biological heterogeneity) we used 
fluorescent calibration beads with a narrow and well characterized 
distribution (Figure S3 and Table SI). These highly uniform beads 
established that the minimum limit of variability in measurements 
of intensity in this assay, CV of 5-8%, was well below the observed 
heterogeneity in STATS activation. Minimizing the "system" 



variability is critical to performing quantitative fluorescence 
imaging and has been discussed in detail [60-62] . 

Selecting heterogeneity indices (Hi's) to apply to single 
cell analyses 

We considered three properties of the distribution of data that 
are significant in the biological interpretation of heterogeneity and 
selected HFs to describe each: 1) How variable is the response? 2) 
Is there more than one type of response? and 3) Are there outiier 
cells that respond differently? We evaluated several statistical 
measures of distribution width (diversity) including the IQR, 
Shannon Entropy (SE) [63], Differential Entropy(DE) and 
Quadratic Entropy (QE) (see Figure S7). We used the QE, which 
has been shown to provide a quantitative measure of species 
diversity and incorporates information not only on the number of 
different species in a population, but also on the magnitude of the 
differences between biological species [64,65]. The QE has also 
been shown to be useful in quantitation of the diversity of cellular 
phenotypes in cancer tissue sections for diagnostic application 
[33], and we have extended that use to the characterization of 
cellular diversity (DIV) in HCA assays. 

To further characterize the population responses with respect to 
the presence of subpopulations (i.e., discrete phenotypic cell states) 
we adopted the definition of macro- and micro-heterogeneity 
proposed by Huang [9]. Macro-heterogeneity refers to the 
variability in one or more cell traits that results in discrete 
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Figure 6. Comparison of the activation of STAT3 across 5 cell lines. Application of the Hi's to the data in Figure 2. Left Panel) Activation of 
pSTAT3 by exposure to lL-6 for 15 min at the indicated concentrations. Right Panel) Activation of pSTATS by exposure to Oncostatin M for 15 min at 
the indicated concentrations. Red Bars) Diversity index (DIV) indicating the relative heterogeneity associated with the activation of pSTATS. The 
horizontal red line indicates the selected threshold for classifying populations a heterogeneous. Green Bars) The non-Normality index (nNRM) 
indicating the extent of deviation from a single, normally distributed population. The green horizontal line indicates the selected threshold for 
classifying a population as having macro-heterogeneity. Blue Bars) The percent outliers (%0L) indicates the percentage of cells with an activity level 
that is above the upper inner fence or below the lower inner fence. The horizontal blue line indicates the selected threshold that is used to classify a 
population as having more than the expected number of outliers. 
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phenotypes and a multimodal distribution. Examples of macro- 
heterogeneity include the distinct states of progenitor vs. differen- 
tiated cells, the phases of the cell cycle and the time dependent 
changes in the intracellular distribution of proteins such as 
transcription factors. Micro-heterogeneity is defined as tlie 
apparently continuous random variation in a single phenotype, 
leading to a normal (or log-normal) distribution of the cell trait. 
Examples include population noise, such as the prolonged 
expression level of a protein during development, and temporal 
noise based on stocliastic fluctuations of a cell trait within a single 
cell over time that are not usually synchronized between cells in 
the population. [9]. Based on these definitions, the distinction 
between micro- and macro-heterogeneity is equivalent to a 
normality test. We evaluated several potential measures of 
distribution shape including skewness, kurtosis, mean-median 
ratio and the Kolmogorov-Smirnov (KS) test relative to a normal 
distribution with the same mean and SD, also known as the 
LUliefors test [66] (see Figure S7). The KS test is an established 
measure of normality [67,68] and the use of KS analysis is well 
known in HCA [30,35,36,69-71]. The deviation of the distribu- 
tion from micro- to macro-heterogeneity, results in an increase in 
the KS statistic, which we use as a non-normality index (nNRM), 
indicating that there may be more than one mechanism of 



response or that the cells may be in more than one state and 
therefore should be further evaluated. 

The third index of heterogeneity quantifies the percentage of 
outliers. The presence of outlier cells that respond distinctly from 
the majority is usually completely ignored in HCA. These outliers 
may be critically important in the development of therapeutics, 
especially in cancer, where a small number of resistant sub-clones 
may exist prior to treatment, then undergo positive selection, 
resulting in only a transient beneficial response and consequently 
result in high rates of relapse [72]. The percent outlier index 
(%OL) was chosen based on the standard statistical definition of 
oudiers used in box plots: samples outside the range from the lower 
inner fence to the upper inner fence. Other choices of outiier 
definitions could also be applied, but this particular definition is 
consistent with our choice of the histo-box plot for reviewing 
heterogeneity. The biological interpretation of outliers is chal- 
lenging due to the relatively small numbers, but requires further 
evaluation when detected. 

The combination of these three heterogeneity indices (HTs) can 
be used to classify the heterogeneity in a cell population using a 
binary decision tree as shown in Figure 5. The criteria for selecting 
classification threshold values will vary depending on the project. 
For example, in the IL-6 activation assay, the negative control 



PLOS ONE I www.plosone.org 



10 



July 2014 I Volume 9 | Issue 7 | e102678 



Heterogeneity in HCS: Application to Drug Discovery 



A 



> 
< 

CO 



00 

o 



3.5 
3.0 

2.5 

2.0 

1.5 

1.0 



Pyridone-6 

IC50 = 66nM 



0.00025 0.00076 0.0023 0.0069 0.021 0.062 



0.19 0.56 1.7 



B 



Pyridone 




1:1 



.;i I. i-"i.t 



20 sP 

OS 



10 ro 



0.000... 0.000... 0.0023 0.0069 0.021 0.0621 0.19 0.56 1.7 

[Compound] (|iM) 



STATTIC 



IC50 = lOuM 

V 



t; 3.5 



S.0 

u 

< 2.5 
CO 



00 

o 




0.0025 0.0076 0.023 0.069 0.21 0.62 1.9 5.6 



17 50 



D 



StattiC 



0.3 



0,2 



0.0 

00 



I 



..4i|4 



0.0025 0.0076 0.023 0.069 0.21 0.62 1.9 5.6 



. I 



17 50 



20 Q 
C 

m 

10 "5 



[Compound] (|iM) 



Figure 7. Heterogeneity in the response to inhibitors of STATS activation. Cal33 cells were exposed to Pyridone-6 (A&B) or STATTIC (C&D) 
at the indicated concentrations for 3 hours prior to stimulation with 50 ng/ml of IL-6. A) Inhibition by Pyridone-6. Log scaled distributions are plotted 
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inhibition. The vertical dashed lines indicate the IC50 for the compounds calculated from the well-averaged signal intensities. 
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wells were nearly 'homogeneous' while in the inhibition assay, 
where all wells contained IL-6, the maximally inhibited wells were 
most nearly 'homogeneous'. We chose to use threshold values that 
were 3 SD above the mean DIV or nNRM in replicate control 
wells as indicating a substantial increase in heterogeneity relative 
to the control. Alternatively, for the nNRM index an absolute 
threshold could be defined based on the critical values for the KS 
test [66,68]. To achieve 99% confidence in the determination of 



non-normality, the KS statistic must be ^1.031/vN, where N is 
the sample size. In this study the minimum sample size was about 
2,000 cells per well which results in a critical KS value of 0.02. We 
used a more conservative threshold of 0.05. For %OL we defined 
an absolute threshold based on the percentage of a normally 
distributed population that would be classified as outiiers (4.5%). 
For screening or other large scale profiling applications, these 
indices can be sorted, clustered or viewed as heat maps to identify 
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cell population profiles that indk:ate more complex biology. For 
visual analysis of the distribution we found the histo-box plot to be 
much more useful than the standard box plot. For multiparameter 
assays, the heterogeneity indices can be evaluated on each 
readout, or a dimension reduction approach such as principal 
component analysis [73,74] can be applied first and the Hi's 
calculated for the principal components. 

Application and impact of heterogeneity analysis in drug 
discovery 

For the reasons stated above it is important to apply 
heterogeneity analysis throughout the early drug discovery process 
from assay design and implementation through secondary screens, 
SAR analysis and into pre-clinical studies (Figure 8). The 
development of disease relevant models and assays begins with 
the analysis of patient samples to identify suitable biomarkers and 
assay readouts, and to estabUsh difierences in the organization and 
heterogeneity profiles of those biomarkers in diseased and normal 
tissues. Physiologically relevant models that recapitulate the 
disease state may require more complex architecture, including 
multiple cell types, which also lead to heterogeneity in assay 
readout(s). The methods proposed here can be applied in both 
cases to characterize and track heterogeneity, and to optimize the 
model. For example, in the Cal33 assay used here, not all of the 
cells responded to IL-6 activation of STATS, whereas all cells 
responded to OSM (Figure 3). Choosing IL-6 stimulation for lead 
identification may limit the screen to selecting compounds that 
have mechanisms present in only a subset of cells (i.e., those that 
are IL-6 dependent), ultimately reducing therapeutic efficacy and 
necessitating combination strategies (see below). A more appro- 
priate assay may be one using OSM and/or a combination of 
cytokines as the inducer, but the choice should optimally be driven 
by an und(;rstanding of the pathway and the role heterogeneity 
plays in the dysregulation of STATS in cancer tissue. 

During implementation of an assay in a screening campaign, 
HFs would be reported alongside the compound potency and 
assay performance statistics (Figure 8 and S9), flagging compound 
concentrations that exceed the established thresholds (Figure 5). A 
DIV that indicates heterogeneity will be an alert that a compound 
induced variable responses within the cell population, and the 
nNRM and %OL will further classify the heterogeneity. Applying 
the decision tree described in Figure 5 to the pyridone-6 data in 
Figures 7 and S9, the DIV index indicates heterogeneity below 
0.56 nM, while the nNRM index shows a concentration 
dependent macro-heterogeneity in the inhibitory response (nNRM 
HI>0.05) indicating sub-populations of cells with different 
sensitivities to pyrid()n(:-6 inhibition. Furthermore, both com- 
pounds show an increase in %OL above the IC50, indicating that 
some cells continue to activate STATS, even in the presence of 
inhibitor. i\lthough representing only a small percentage of the 
cells, resistance to treatment may have important implications in 
cancer therapy. The %OL feature should provide additional 
information for selection and optimization of hits and leads, as well 
as a readout that could be used to screen combinations of 
compounds for improved efficacy with respect to outlier cells. 

In drug development, compounds where macro-heterogeneity is 
ideiitifiecl would need to be further studied, perhaps starting with 
the histo-box plot. Compounds exhibiting heterogeneity present 
two options: (1) deprioritize in favor of compounds that modulate 
the cell population more uniformly; or (2) select the compounds for 
efficacy in a specific sub-population for use in a combination 
therapy strategy. In this study pyridone-6 and Stattic exhibited 
very different, dose dependent heterogeneity profiles, consistent 
with their different reported mechanisms of action. The objective 



of monitoring heterogeneity in secondary assays should be to 
identify potential differences in MOA between lead compounds 
and to more fully characterize the range of cellular responses, 
enabling more informed decisions in selecting compounds to 
advance through drug development. 

Finally, it is important to follow the heterogeneity profile while 
investigating SAR in the lead optimization stage to ensure that 
changes in the compound structure do not introduce additional or 
undesirable heterogeneity in the response, implying additional 
mechanisms of action. Furthermore, the heterogeneity profile can 
be used in combination with biological p()t(;ncy to drive the SAR 
of a lead series towards a non-disease profile. 

Importance of heterogeneity analysis in understanding 
biological regulation and in developing optimized cancer 
therapeutic strategies 

Darwinian-like clonal evolution significantly contributes to the 
genetic heterogeneity within tumors, which contributes to the 
observed phenotypic diversity [53,75,76]. Additional intra-tumoral 
phenotypic diversity results from epigenetic changes [43,53,75] or 
as a consequence of heterotypic signaling within an abnormal 
micro-environment [76,77]. This phenotypic diversity and plas- 
ticity, in conjunction with the complexity of STATS signaling and 
regulation that involves crosstalk with several other pathways (e.g. 
PISK, RAS, NFkB, NOTCH) [78-83], enables smaU molecule 
perturbagens to induce both micro- and macro-heterogeneous 
responses. For example, fluctuations in the expression of signaling 
components can alter the kinetics of a specific step targeted by a 
small molecule, inducing micro-heterogeneity. Alternatively mac- 
ro-heterogeneity, such as evidenced by the presence of apparentiy 
non-responder subpopulations, could result from changes in 
protein expression tiiat result in dysregulation of the crosstalk 
involved in negative feedback or the activation of compensatory 
pathways. In fact these latter two mechanisms are among the most 
common for generating resistance to targeted therapies in cancer 
[72,84—87]. Thus, heterogeneity presents a major challenge to 
optimizing therapeutic regimens, as the targeting of a predominant 
tumor subpopulation often only provides transient benefit and will 
inevitably result in the emergence of resistant populations, and 
relapse [72]. 

Recent studies suggest that knowledge of the tumor composition 
and the response of component subpopulations to single drugs, in 
conjunction with computational and experimental modeling, can 
identify drug combinations that minimize the outgrowth of 
resistant subpopulations in tumors, while enhancing tumor free 
survival in mice [8,88] . Importantly, the experimentally validated 
computational simulations demonstrated that the optimal drug 
combination predicted depended on whether the entire tumor 
population, or only a particular subpopulation, was examined. 
These results further emphasize the need to incorporate intra- 
tumor heterogeneity and the expected evolutionary trajectories 
into rational drug combination design. 

The development of comprehensive, unbiased target based 
mutagenesis and genome-wide gain- and loss-of-function technol- 
ogies that can anticipate clinically relevant resistance, represents 
an alternative, or perhaps complementary approach to modeling 
and therapeutically addressing tumor lu^terogcncity [72,90,91]. It 
is likely that the future of personalized cant:er medicine will involve 
a comprehensive bio informatics analysis of a biopsy that will reveal 
the presence of distinct cell populations [6,89] and consultation of 
an estabhshed drug-genotype database that will allow clinicians to 
computationally determine optimal, patient-specific combination 
therapies. We hypothesize that heterogeneity analysis wiU be 
essential to the implementation of a Q_SP-driven approach that 
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Clinical Tissue 
(e.g. Tumor biopsy) 



Use to define disease relevant model 

Establish heterogeneity profile for disease and normal tissue 



Assay Design and Development 



• Develop disease relevant model including heterogeneity 



Assay Optimization 



Optimize performance parameters S/B and Z' 
Establish "operational" HI thresholds 
Establish hit criteria based on HI and potency 



Assay Implementation 



Select hits based on 
potency and HI profile 



Secondary Assays 



Two Options in Prioritizing Hits 
When There is Heterogeneity: 

1) Prioritize hits that exhibit maximal 
uniformity in response and potency 

2) Prioritize hits that exhibit maximal 
potency for specific sub-populations: 
strategic polypharmacology 



Assess potency and heterogeneity in orthogonal assays 



SAR, Lead Optimization 



Follow SAR of potency and heterogeneity profile 



Pre-clinical studies 



Figure 8. Heterogeneity analysis applied throughout the early drug discovery process. [Heterogeneity analysis is required to guide 
decisions throughout the drug discovery process, beginning with defining disease relevant biology in clinical samples, and establishing benchmarks 
for subsequent analyses. Next disease relevant models, which by necessity will be heterogeneous, are developed and optimized. Heterogeneity is 
characterized in the models, and thresholds for Hi's are established along with potency criteria to select hits. Screening hits are advanced to 
secondary assays based on their potency and HI profile. Heterogeneity of response to compounds will be model dependent, and assessing 
heterogeneity in orthogonal secondary assays will provide insights into understanding the MOA. Monitoring the heterogeneity profile during SAR 
and lead optimization is essential to keeping lead development on target and mechanism of the disease relevant biology. 
doi:10.1371/journal.pone.0102678.g008 



includes the highly coordinated, parallel optimization of comple- 
mentary lead structures, where each structure has a clinically 
relevant resistance profile that is addressed by its counterpart, 
which will lead to polypharmacologic therapies that effectively 
drive the tumor into a state of 'checkmate', thereby providing 
sustainable remissions and higher cure rates. 



Supporting Information 

Figure SI Time course and dose-response of the 
activation of STATS in Cal33 cells. A) Dose-response of IL- 
6 indicated by logistic regression curve fit to well average 
intensities in 3 replicate wells. Error bars (±la of the cell-by-cell 
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intensities) indicate the high variability in the cell-to-cell STATS 
activity B) Dose-response of OSM activation of STATS fit as in A. 
Error bars (±1ct of the cellular intensities) again indicate a high 
degree of heterogeneity. C) Well average STATS activity in CalSS 
cells that were exposed to the indicated concentrations of IL-6 for 
the times indicated by color. Error bars (± 1 a of the S replicate 
wells) indicate the assay is highly reproducible despite cellular 
heterogeneity. D) Same as C) except cells were exposed to the 
indicated concentrations of OSM. Time indicated by colors. 
(TIF) 

Figure S2 Use of Histo-Box plot for visualization of 
distributions. A) Histograms of simulated, normally distributed 
data with the indicated mean and a CV= 30% for all unimodal 
distributions. The 2 bimodal distributions combine the distribu- 
tions with means of 500 and 10 and CV = S0%, equally weighted 
('500-1-10') and with a 90:10 split ('500(90%)-H0(10%)'). Each 
distribution consists of 2000 random numbers. B) The same 
distributions as in A, logarithmically scaled. The key defines the 
reference points labeled on the plot (A). 
(TIF) 

Figure S3 Quantitating the Reproducibility of Intensity 
Measurements by Flow and Imaging Cytometry. A) 

Histogram of the total intensity of 2 |im flow cytometry standard 
beads measured by flow cytometry shows peaks for single beads 
(red, CV= 2.8) and double beads (blue). B) Histogram of the same 
beads centrifuged in the wells of a S84 well microplate and imaged 
to measure total bead intensity also shows peaks for single beads 
(red, CV = 5.2) and double beads (blue). C) The histogram of total 
nuclear intensity in GalSS cells fixed in suspension, labeled with 
Hoechst and measured by flow cytometry. Cell cycle modeling 
identifies S subpopulations, GO/Gl (red), S (hashed), and G2/M 
(blue). D) The histogram and cell cycle modeling of the same cells 
centrifuged in the wells of a S84 well microplate then imaged and 
analyzed for total nuclear intensity. 
(TIF) 

Figure S4 The distributions of STATS activation for cell 
cycle subpopulations. Histograms of STATS activation in 

unstimulated and IL-6 stimulated CalSS cells (the cell cycle states 
identified in the legend). Inset DNA histogram of the cumulative 
population shows the mapping of DNA intensities to cell cycle 
states. 
(TIF) 

Figure S5 Western Blot Analysis of Receptor Expres- 
sion. A) Western blots of receptor expression in five cell lines, with 
Tubulin as a control. B) Quantitation of total density in western 
blot bands for the IL-6 receptor, relative to CalSS cells. C) 
Quantitation of OSM receptor expression, relative to CalSS. D) 
Quantitation of IFNy receptor expression, relative to CalSS. 
(TIF) 

Figure S6 STATS activation by Combinations of Cyto- 
kines. To assess the interaction between the IL-6 and OSM 
pathways, CalSS cells were exposed to combinations of lL-6 and 
OSM for 15 min. Top Row) The activation of STATS by IL-6 
alone. Left Column) The activation of STATS by OSM alone. 
The arrows point to the location of the population non-responsive 
to IL-6. Note that with the addition of OSM aU cells treated with 
IL-6 show activated STATS. 
(TIF) 

Figure S7 Evaluation of Potential indices of Diversity 
and Normality. A) Model distributions and cell data were used 
to evaluate the performance of selected metrics for characterizing 



the distributions. The 50:50 mix consists of 2 unit normal 
distributions of equal population that are separated by 'd' standard 
deviations (sd). The 90:10 mix consists of 2 unit normal 
distributions with 90% and 10% of the population, separated by 
'd' standard deviations. B) Selected Diversity and Normality 
indices were used to evaluate the model distributions for values of 
'd' ranging from 0-10 sd, and the CalSS data for IL-6 and OSM 
stimulation. For the CalSS data, the diversity indices all show 
similar response, while the model data show some key differences. 
The IQR (interquartile range) is not sensitive to the small 10% 
subpopulation, and the SE (Shannon entropy) and DE (Differen- 
tial entropy, the Shannon entropy for a continuous distribution 
function) both plateau when the 2 populations sc-parate. Only the 
Q_E (Quadratic entropy) shows a steady increase for both 
distributions. Again, the general pattern of the 'Normality' 
measures is similar for the CalSS data, but the model data show 
key differences. The skewness and MMR (mean/median) are 
insensitive to the 50:50 population because it is symmetric. The 
kurtosis and KS statistic are sensitive to the variation in both 
distributions, however the KS was preferred due to its direct 
interpretation as a measure of normality. 
(TIF) 

Figure SS Dose dependence of inhibitors of STATS 

activation by IL-6. STATS activation in CalSS cells is inhibited 
by Pyridone-6 (•) widi an IC50 = 66 nM or STATTIC (■) with an 
IC50 = 10 mM. 
(TIF) 

Figure S9 Integrating heterogeneity analysis into phe- 
notypic screening. HeterogerK'ity indices are evaluated during 
assay development and thresholds determined based on the goals 
of the project. For drug discovery and cell/tissue profiling 
programs that encounter phenotypic heterogeneity, HCS images 
are analyzed to generate the features, statistical parameters and 
Hi's. For samples or treatments with low Diversity (DIV) or a 
normal distribution (low nNRM) standard statistics can be used. A 
well or sample with a high HI and high nNRM or high %OL 
would require more detailed analysis of the heterogeneity. If the 
observed heterogeneity is biologicaUy important in the context of 
the project, further experiments aimed at understanding its 
mechanism may lead to discovery of new targets or diagnostic 
biomarkers. For pyridone-6, the Dl\' indc'x for concentrations 
below 5 |jM indicates a high degri;e of heterogeneity (HI>0.0S 
from Figur(; 5) which is further characterized as macro- 
heterogeneity since the nNRM incfices are >0.5. At 5 |i,M the 
DIV indicates a homogeneous population with low heterogeneity. 
In all cases the %OL is below the HI threshold in Figure 5. The 
high heterogeneity indices suggest further studies are needed to 
understand the activity of pyridone-6 on these cells. 
(TIF) 

DataArchive SI Data for the distribution of STATS 
activation in 5 cell lines stimulated with 10 concentra- 
tions of IL-6 or OSM as plotted in Figure 2. Data provided 
as a ZIP archive. 

(ZIP) 

DataArchive S2 Data for the inhibition of IL-6 induced 
STATS activation by Pyridone-6 and Stattic as plotted in 
Figure 7. Data provided as a ZIP archive. 

(ZIP) 

Table SI Reproducibility of Intensity Measures. Flow 

cytometry standard beads and CalSS cells were used to quantify 
the reproducibility of imaging intensity measurements on cells and 
cell sized objects. Samples of beads or cells were split and run on 
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either the ArraySt:an HCA system or a flow cytometcr for 
reference. For beads, Ratio is (mean doublet total intensity)/ (mean 
singlet total intensity). For Gal33 cells, Ratio is (mean G2/M total 
nuclear intensity)/(mean GO/Gl total nuclear intensity). 
(DOCX) 

Table S2 Power analysis of HI measures . Replicate 

measures on 3 different days were used to determine the number 
of cells required to achieve a power of 0.8 for the CV, KS and QE 
measures of the distributions of STAT3 activity in Gal33 cells. 
(DOCX) 
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